Epigenome-wide association study of diabetic chronic kidney disease progression in the Korean population: the KNOW-CKD study

Since the etiology of diabetic chronic kidney disease (CKD) is multifactorial, studies on DNA methylation for kidney function deterioration have rarely been performed despite the need for an epigenetic approach. Therefore, this study aimed to identify epigenetic markers associated with CKD progression based on the decline in the estimated glomerular filtration rate in diabetic CKD in Korea. An epigenome-wide association study was performed using whole blood samples from 180 CKD recruited from the KNOW-CKD cohort. Pyrosequencing was also performed on 133 CKD participants as an external replication analysis. Functional analyses, including the analysis of disease-gene networks, reactome pathways, and protein–protein interaction networks, were conducted to identify the biological mechanisms of CpG sites. A phenome-wide association study was performed to determine the associations between CpG sites and other phenotypes. Two epigenetic markers, cg10297223 on AGTR1 and cg02990553 on KRT28 indicated a potential association with diabetic CKD progression. Based on the functional analyses, other phenotypes (blood pressure and cardiac arrhythmia for AGTR1) and biological pathways (keratinization and cornified envelope for KRT28) related to CKD were also identified. This study suggests a potential association between the cg10297223 and cg02990553 and the progression of diabetic CKD in Koreans. Nevertheless, further validation is needed through additional studies.

Data source and study population. The KNOW-CKD cohort was used to perform an EWAS for diabetic CKD progression. The KNOW-CKD study is a prospective multicenter cohort involving 2238 participants with specific causes of chronic kidney disease (CKD) grouped into glomerulonephritis (GN) (n = 810), diabetic nephropathy (DN) (n = 519), hypertensive nephropathy (HTN) (n = 409), polycystic kidney disease (PKD) (n = 364), and unclassified (n = 136) (Supplementary Fig. 1) 10 . Subgroups were defined by pathologic diagnosis, where available, otherwise by clinical diagnosis 11 . GN was identified by the presence of glomerular hematuria or albuminuria, with or without an underlying systemic disease. DN was diagnosed based on albuminuria in individuals with type 2 diabetes mellitus and diabetic retinopathy. HTN was determined by hypertension history and absence of systemic illness associated with renal damage. PKD was diagnosed using unified ultrasound criteria. Other causative diseases were classified as 'unclassified. ' The KNOW-CKD cohort has been described in detail elsewhere 11 .
The population was divided into non-progression and progression, based on the eGFR slope of − 2.6 ml/ min/1.73 m 2 /year, which was calculated as the median the eGFR slope in the DN from the KNOW-CKD cohort. In addition, the threshold of eGFR slope for HTN was − 2.1 ml/min/1.73 m 2 /year, while it was − 2.6 ml/min/1.73 m 2 /year for DN in a previous study 12 .
This study was a follow-up to the genome-wide association study (GWAS) of the KNOW-CKD group 13 . Out of 519 participants with DN from KNOW-CKD, 434 individuals passed the quality control (QC) for GWAS ( Supplementary Fig. 1).
We estimated the appropriate number of participants to be included in the EWAS based on power calculation 14 . Statistical power was estimated from a minimum of 20 to a maximum of 200 participants, assuming a 1:1 ratio of progression to non-progression, 780,000 total CpG sites, 800 targeted CpG sites, minimum detection of |∆ M-value|= 0.01, limma method, FDR threshold of 0.05, and 100 simulations ( Supplementary Fig. 2).
Based on power calculation, we attempted to perform the EWAS on 200 participants matched by sex, age, and baseline eGFR between progression and non-progression groups. However, 20 out of 200 participants failed the final epigenomic sample QC process. Therefore, 180 participants (progression: 93, non-progression: 87) were included in the EWAS.
We also performed the pyrosequencing analysis as validation. A total of 78 individuals from DN in KNOW-CKD, excluding those in EWAS, matched by age, sex, and baseline eGFR who passed the QC of pyrosequencing (Supplementary Fig. 1; Supplementary Table 1) were selected. In addition, 55 individuals (41 progression, 14 non-progression) diagnosed with DN from biopsy in the Seoul National University Hospital (SNUH) Human Biobank were included for pyrosequencing analysis (Supplementary Fig. 1). Finally, pyrosequencing analysis was performed based on a total of 133 participants ( Supplementary Fig. 1).
Outcome measurement. The eGFR was calculated using the four-variable Chronic Kidney Disease Epidemiology Collaboration equation 15 . The eGFR slope in the KNOW-CKD cohort was calculated based on linear mixed models with random intercepts using MIXED procedures in SAS software (SAS Institute, Inc., Cary, North Carolina) 16 . The eGFR slope was estimated using creatinine values, measured at every time point; the initiation of the cohort, 6 months after the initiation of the cohort, and at least one follow-up since 2011, every 1-7 years. Only participants that had had eGFR measured at least three times were included. The LMM was fitted, where follow-up time was the dependent variable, and eGFR was the independent variable.
The fixed effect was the effect of "time" on eGFR. The fixed effect represents the average change in eGFR over time in all participants. The random effects are the participant-specific intercepts and slopes of the association between "time" and eGFR. The random intercept represents the variation among the participants in their baseline level of eGFR. The random slope term for "time" captures the variation among the participants in their rate of change of eGFR over time. Therefore, LMM was fitted, allowing each participant to have their own baseline level of eGFR and rate of change in eGFR over time 13,17 . Quality control and EWAS. We performed quality control of DNA methylation data extracted from raw intensity data (IDAT), including the signal intensities for each of the probes on the chip with over 1 million probes. To minimize the unintended variation within and between samples, we implemented quantile normalization, which considers the methylated and unmethylated signal intensities separately. We excluded probes with a detection P-value of > 0.05, which can be considered a low-quality signal from all samples. The detection P-value was calculated using the "m + u" method, which compares the total DNA signal (methylated + unmethylated) at each site to the background signal level which is estimated using negative control sites, assuming a normal distribution 18 . CpG sites that failed in one or more samples are filtered based on the detection P-value. We removed the probes on the X or Y chromosome, in addition to the probes affected by single nucleotide polymorphisms (SNPs) without the specification of a certain minor allele frequency 19 . In addition, non-specific binding probes that mapped to multiple locations on the genome were filtered 20 . The annotation was performed by an Illumina Infinium MethylationEPIC BeadChip (EPIC chip), which is a microarray platform designed to DNA methylation across over 860,000 CpG sites in human genome. Finally, 784,864 out of 1,051,815 probes passed the quality control and were included in the EWAS ( Supplementary Fig. 3).
CpG sites associated with CKD progression were identified using linear regression models implemented in the limma package in R with an empirical Bayesian framework 21 . The methylation levels at each CpG probe are represented as M-values. The beta-value is a commonly used measure of DNA methylation that ranges from 0 to 1, representing the proportion of methylation at a given CpG site 22 . Conversely, the M-value, or logit-transformed beta-value, is the log2 ratio of the intensities of methylated versus unmethylated probes, calculated as 22 : The M-value has the advantage of being symmetrical around zero, and it is often used in statistical analyses, as it allows for more accurate measurement of differential methylation between groups. The M-value ranges from − ∞ to ∞, with values close to zero indicating low methylation and increasingly negative or positive values indicating higher levels of hypomethylation or hypermethylation, respectively 22 .
Since the differences in the various cell types of the whole blood between progression and non-progression can lead to false differentiated methylation regions, the effects of cell proportion on the results of EWAS should be considered 23 . Therefore, in addition to the original model without adjustment for the blood cell proportions (referred to as Model 1), we have also adjusted for blood cell proportions, including T lymphocytes, B cells, monocytes, NK cells, and neutrophils, to remove the false CpG sites from the differences of cell proportions based on the Houseman method (referred to as Model 2). Furthermore, we adjusted for body mass index (BMI) and smoking status (yes/no) as covariates (referred to as Model 3). All summary statistics are provided in Supplementary Table 2.
All results in this study are methylation differences in the primary outcome, diabetic CKD progression versus non-progression. Following the implementation of an epigenome-wide significance threshold of < 0.05 using the false discovery rate (FDR) 24 , the number of CpG sites were reduced from a total of 784,864 to 9,809, 8,900, and 8,690 in Model 1, Model 2, and Model 3, respectively (Supplementary Fig. 3; Supplementary Fig. 4).
Subsequently, CpG sites without an annotation for gene symbols were removed (7252, 6537, and 6328 CpG sites in the Model 1, Model 2, and Model 3, respectively). We only selected CpG sites located in the promoter regions (promoters were defined as regions located between 0 and 1500 bp upstream of transcriptional start sites (TSS), 5′UTR, and the 1st exon). There were 3837, 3462, and 3261 CpG sites in the Model 1, Model 2, and Model 3, respectively. In addition, CpG sites located in the shelf and shore regions of the CpG island (CGI) were selected (843, 774, and 742 CpG sites in the Model 1, Model 2, and Model 3, respectively). Furthermore, CpG sites with a more restricted FDR threshold (FDR < 0.005) were selected to perform pyrosequencing and in-silico functional analysis (197, 157, and 157 CpG sites in the Model 1, Model 2, and Model 3, respectively).
We used the top five percentile |∆ M-value| as the threshold in the distribution of |∆ M-value| to exclude false positive CpG sites. Since |∆ M-value| has a left skewed distribution, we determined the top five percentile values based on a non-parametric bootstrapping resampling method (alpha = 0.05, the number of resampling = 1000) Pyrosequencing: replication analysis. The 17 candidate CpG sites used in pyrosequencing analysis were selected from the results of EWAS with no adjustment for blood cell proportions (Supplementary Fig. 3; Supplementary Fig. 6). The pyrosequencing primer was designed using the PyroMark Assay Design SW 2.0 software (QIAGEN) under the following three conditions: (1) maximum amplicon length < 200 bp, (2) primer set score ≥ 75, and (3) primers attached to CpG sites were excluded ( Supplementary Fig. 6). Ultimately, only 11 out of 17 CpG sites (cg11513352, cg10297223, cg22773662, cg03503634, cg04089320, cg20746451, cg14279121, cg15280188, cg11508872, cg21285133, and cg02990553 within genes DOC2A, AGTR1 (Angiotensin II receptor type 1), MIEF1, TRAF6, EMB, SMARCAD1, OSBPL9, ASPSCR1, RAB14, ANP32E, and KRT28 (Keratin 28), respectively) were available to undergo pyrosequencing analysis under these three conditions. The primer sequences used in this study are listed in Supplementary Table 3.
For each assay, bisulfite-converted DNA was amplified using PCR, using the instructions provided by the manufacture of by the PyroMArk PCR kit (QIAGEN). The PCR product was bound to magnetic streptavidin beads. Quality control of the pyrosequencing data was performed using the PyroMark Q48 software. All samples passed the quality control process. Sequencing was performed on a PyroMark Q48 Autoprep system using the PyroMark Q48 Advanced CpG Reagents (QIAGEN) according to the manufacturer's instructions.
The percentage of DNA methylation at specific CpG sites was estimated using the PyroMark Q48 Autoprep 2.4.2 software (QIAGEN) and exported to the R statistical environment. Subsequently, linear regressions were performed for each CpG site covered by the assay, as well as for the average methylation value across the region.
We performed a linear regression analysis with the average methylation level of methylated cytosines as the dependent variable and progression/non-progression as the independent variable to select CpG sites that show differential methylation levels between these two groups (progression/non-progression) 27 . The beta estimation in the regression was used to calculate the difference in methylation levels between the two groups (Supplementary Table 3).
Phenome-wide association study. We performed the PheWAS for cg10297223 and cg02990553 CpG sites based on the variables (phenotypes) in KNOW-CKD cohort. Based on a total of 1,028 variables in KNOW-CKD cohort, we excluded 719 variables with a missing rate over 10%. Of the remaining 309 variables, including in the PheWAS, 144 were continuous and 165 were categorical variables. The association between each |M-value| of the CpG sites and the phenotype was estimated using linear or logistic regression models according to the continuous or categorical phenotypes, respectively. The statistical significance threshold for PheWAS was also set at FDR < 0.05 using the Benjamini-Hochberg method 24 . In silico functional analysis. We further performed functional annotation analysis, such as the analysis of disease-gene network (DGN), reactome (RA) pathways, and protein-protein interaction (PPI) network, to identify the biological mechanisms of CpG sites. DNG has been used to identify cross-phenotypes associated with selected genes from CpG sites using DisGeNET 28 . We also used the RA database to annotate gene sets for biological pathways 29 . The PPI network was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING; http:// string. embl. de/) with a confidence score ≥ 0.99 to identify the functional interactions between proteins 30 . Statistical significance was determined by a false discovery rate (FDR)-corrected P-value of < 0.05. Furthermore, we identified the Expression Quantitative Trait Methylation (eQTM) based on a human whole-blood epigenome-wide association study from the Human Kidney eQTM by Susztak Lab (available on https:// suszt aklab. com/ Kidney_ meQTL/ index. php) 31 . Network illustrations from the functional analyses were constructed using the Cytoscape software (version 3.9.1) via Rcy3 32,33 .
DNA methyltransferase (DNMT) inhibitor treatment. HEK 293 cells were cultured in Dulbecco's modified Eagle's medium (Thermo Fisher Scientific, Waltham, MA, USA) supplemented with 10% fetal bovine serum (FBS, Thermo Fisher Scientific), 100 U/ml penicillin (Thermo Fisher Scientific), and 100 μg/ml streptomycin (Thermo Fisher Scientific) in an atmosphere containing 95% humidified air and 5% CO 2 at 37 °C. To demethylate methylated CpG sites, HEK 293 cells were treated with increasing concentrations (0, 5, 10, and 20 μM) of 5-aza-2′-deoxycytidine (Sigma-Aldrich, St. Louis, MO, USA) for 72 h, which was replaced daily. Inhibition of methylation was examined by pyrosequencing analysis, and changes in AGTR1 expression were measured by reverse-transcription quantitative polymerase chain reaction (RT-qPCR). The reactions were run on a 7500 Fast Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) at 95 °C for 30 s, followed by 40 cycles at 95 °C for 3 s and 60 °C for 30 s, and a single cycle at 95 °C for 15 s, 60 °C for 60 s, and 95 °C for 15 s to generate dissociation curves. All PCR reactions were performed in triplicate, and the specificity of the reaction was determined by melting curve analysis. Comparative quantification of each target gene was performed based on the cycle threshold (Ct) normalized to GAPDH, using the ΔΔCt method 34 .

Results
General characteristics of study population. Table 1 shows the clinical and demographic characteristics of 180 diabetic participants with CKD (93 progression versus 87 non-progression) based on the KNOW-CKD cohort study. The mean age was 59.1 years, men accounted for 65% of the participants, and most participants had hypertension (98.3%). Urine albumin, urine protein, UACR, UPCR, 24-h urine protein, and 24-h urine phosphorus levels were higher in the progression group than in the non-progression group (Table 1).
Furthermore, Supplementary Table 1 shows the general characteristics of 78 DN from the KNOW-CKD cohort and 55 from the biopsy in SNUH Human Biobank for pyrosequencing analysis. Diastolic blood pressure (BP), Urine albumin, urine protein, UACR, UPCR, and 24-h urine protein levels were higher in the progression group than in the non-progression group among the 78 participants. However, only a limited number of variables were investigated among the 55 participants.
The results of pyrosequencing provided information about the proportion of methylated cytosines in each DNA sample, as well as information about the average level of methylation at individual CpG sites 35 . Of the 11 candidate CpG sites available for pyrosequencing primer design, six genes (DOC2A, MIEF1, EMB, SMARCAD1, ASPSCR1, and ANP32E) could not be considered validated due to the inconsistent direction of effect size with the discovery results (Supplementary Table 3 (Table 2; Fig. 2). In addition, both cg10297223 and cg02990553 were associated with seven phenotypes (24-h urine protein, 24-h urine phosphorus, urine albumin, urine protein, UPCR, UACR, and eGFR slope) based on the PheWAS (Table 3).
In silico functional analysis. Based on the functional analysis of the DGN, AGTR1 and STOX1 were associated with elevated systolic (FDR = 3.22E−02) and diastolic BP (FDR = 3.22E−02) (Fig. 3). In addition, AGTR1 and NT5C2 were associated with pre-hypertension (FDR = 3.53E−02), and AGTR1 and KCNC4 were associated with adverse events associated with cardiac arrhythmia (FDR = 3.59E−02). AGTR1 was also associated with TRAF6 based on the PPI network. KRT28 was involved in the biological pathways of developmental biology, keratinization, and the formation of the cornified envelope (FDR = 3.39E−39) based on the RA pathways ( Fig. 3; Supplementary Table 4). In addition, four CpG sites (g15280188, cg14279121, cg04089320, and cg11513352) among a total of 17 top CpG sites were identified as eQTM (Supplementary Table 5).

AGTR1 expression was regulated by epigenetic DNA methylation.
To determine whether the expression of AGTR1 mRNA was epigenetically modulated, we treated HEK 293 cells with the DNA methyltransferase inhibitor 5-aza-2′-deoxycytidine. The expression of AGTR1 mRNA was quantified by RT-qPCR and the methylation status of the CpG site (cg10297223) within AGTR1 was determined by pyrosequencing analysis. After treatment with 5-aza-2′-deoxycytidine, the expression of AGTR1 mRNA was significantly restored (~ 1.67-fold) in a dose-dependent manner, which occurred concurrently with the decreased methylation status of the AGTR1 promoter CpG site (Fig. 4). These results indicate that AGTR1 expression is regulated by a DNA methylation-dependent mechanism.

Discussion
In our study, EWAS was performed to select CpG sites that were differentially methylated during the progression of diabetic CKD in the Korean population. External replication analysis was performed using pyrosequencing, focusing on the top-ranked candidate CpG sites. Consequently, cg10297223 on AGTR1 and cg02990553 on KRT28 were found to be significant CpG markers, and gene-level functional analysis was performed to confirm that the two CpG sites share biological mechanisms with diabetic CKD progression based on existing knowledge or hypotheses.
The AGTR1 is a G-protein-coupled transmembrane receptor located at the end of the renin-angiotensin-aldosterone system (RAAS) cascade 36 . The RAAS cascade is a major regulator of systemic arterial blood www.nature.com/scientificreports/ pressure, fluid, and electrolyte balance 37 , primarily functions in the second stage of the embryo, and plays essential roles in neonates, maintenance of peripheral vascular resistance, and renal blood flow 38 .
In addition, the interaction between AGTR1 and angiotensin II, which is released from mesangial cells, has been demonstrated to activate the inflammatory cascade by regulating protein kinase C and the mitogen-activated  39 and induce the expression of growth factors and proliferative cytokines to sustain the generation of nephrotoxic reactive oxygen, resulting in inflammation, fibroblast formation, and collagen deposition 40 . Angiotensin II also activates signaling of the NF-κB pathways, which are activated by TNF-receptor-associated factor (TRAF), leading to inflammation [41][42][43] . Therefore, it has been suggested that pathogenic mutations leading to the absence or defects in AGTR1 can induce fatal phenotypes 44 . Moreover, chronic activation of the RAAS is recognized as a critical factor in CKD  Table 3. Phenome-wide association study based on M-values of cg10297223 (AGTR1) and cg02990553 (KRT28). FDR, false discovery rate; UPCR, urinary protein-to-creatinine ratio; UACR, urinary albumin-toprotein ratio; eGFR, estimated glomerular filtration rate.

Phenotypes
Effect size P-value FDR A study on AGTR1-related CKD in RAAS at the GWAS level was also reported in a systematic review and meta-analysis 48 . In the Chronic Renal Insufficiency Cohort study based on Caucasian and African American populations in 2015, the association between RAAS-related genes with CKD was reported, but AGTR1 was not found to be significantly associated with CKD 49 . However, another GWA study in an African American population reported an association between AGTR1 and diabetic ESKD 50 . Nevertheless, previous studies have only  After treatment, demethylation of AGTR1 promoter CpG site (cg10297223) was confirmed by pyrosequencing analysis (a) and the expression of AGTR1 mRNA was measured by RT-qPCR (b). Data are presented as the mean ± SD from three independent experiments. Statistical analyses were performed using one-way ANOVA with Dunnett's multiple comparison post-test for comparing significance with untreated control (*P < 0.05, **P < 0.01, ***P < 0.001). 5-aza, 5-aza-2′deoxycytidine. www.nature.com/scientificreports/ reported an association between the alleles of SNPs in AGTR1 and CKD, and studies on gene activation, such as gene expression or methylation of AGTR1, especially in the Korean population, have not yet been described.
KRT28 encodes a member of the type I (acidic) keratin family, which belongs to the superfamily of intermediate filament (IF) proteins 51 . Previous studies have reported that a cornified envelope or keratinization, which is associated with KRT28, is also associated with CKD.
The components of the cornified envelope were considerably reduced in participants with CKD compared to that of the control group in a previous study 52 . It has been reported that treatment with emollients can reduce the thickness and density of scales and noticeably improve the quality of life of CKD 53 .
Moreover, acquired perforating dermatosis (APD), which is caused by chronic friction leading to epithelial proliferation, abnormal keratinization, and decreased blood supply due to microangiopathy, is often associated with underlying systemic diseases, such as diabetes mellitus and CKD 54 . APD most often occurs after starting dialysis in CKD 55 . Kidney damage is known to affect wound healing 56 . Research data on rats also showed an exacerbating effect of CKD on wound healing, which is mediated by the disruption of keratinization and delayed granulation 56 . In addition, veiled chronic inflammatory conditions, low rates of angiogenesis, and cell proliferation also contribute to poor wound healing 57 . Although several KRT series genes related to keratinization have been reported in previous studies, KRT28 (particularly as an epigenetic marker in Korean populations) was implicated for the first time in our study 58 .
Previous studies based on similar hypotheses have been reported for populations other than Koreans. The previous study has reported the enhancement of renal regulatory regions and their correlation with gene expression changes, including epidermal growth factor, related to kidney damage and impaired function using methylation probes 59 . Another study reported similar correlation results for individuals receiving kidney transplants or dialysis, demonstrating the ability to analyze transplant recipients alongside individuals receiving dialysis to improve the performance of future EWAS for ESKD 60 .
Our study had several limitations that need to be acknowledged. First, although we selected candidate CpG sites for performing external replication analysis using pyrosequencing, epigenome-wide replication analysis could not be performed owing to the lack of Korean or Asian-based CKD cohorts with epigenomic databases. Nevertheless, since the KNOW-CKD cohort, which forms the basis of the current study, has almost completed the recruitment of an additional 1500 CKD participants for phase II and has started follow-up (https:// clini caltr ials. gov/ ct2/ show/ NCT03 929900), we will be able to conduct epigenome-wide validation analysis in the future 13 .
Second, because the DNA samples used in our study were derived from peripheral blood samples, there is limited information on the association between whole blood DNA methylation profiles and kidney tissue-specific DNA methylation differentiation, in part due to the heterogeneity of cell types within the kidney. However, a previous study suggested that blood DNA methylation analysis is valuable because it can reflect changes in DNA methylation in the tissues associated with the phenotypes 61 . Nevertheless, the establishment of a biobank of kidney biopsies is needed to improve tissue-specific DNA methylation analysis for kidney disease in the future 61 .
Although we used the threshold of |∆ M-value| as 0.3 and 0.0038 for EWAS with/without adjustment of blood cell proportions and with adjustment of blood cell proportions, BMI, and smoking status, respectively, in order to exclude false positive CpG probes, there is a possibility that CpG probes with a small difference were not considered as candidate CpG sites for the validation due to the high threshold of the |∆ M-value|.
There was a possibility that cg10297223 and cg02990553 could be validated in the pyrosequencing analysis in our study. However, although cg10297223 had a significant raw P-value (P-value < 0.05), both CpG sites had no statistical significance in FDR or Bonferroni correction (Supplementary Table 3). In future, we hope to validate this finding, utilizing the KNOW-CKD phase II cohort which, as already mentioned, will include an additional 1500 CKD participants-the recruitment of these participants is almost completed 13 .
Since the Human Kidney eQTM results gathered by Susztak Lab were sampled from a different ethnic group than the ethnicity of the cohort used in our study 31 , there may be an association between Korean-specific epigenetics markers and gene expression that has not yet been identified.
Furthermore, the use of 5-aza-2′-deoxycytidine results in the demethylation of CpGs throughout the genome of cells, making it challenging to apply this treatment for the causal analysis of effects. Therefore, caution should be exercised when interpreting the results, as the observed changes may not be directly attributable to the targeted CpG sites 62 .
Despite these limitations, our study had several strengths. First, although epigenome-wide replication analysis could not be performed, functional annotation analysis was conducted in silico to elucidate the biological mechanisms of the CpG sites identified in our study. Moreover, based on PheWAS, CpG sites associated with diabetic CKD in our study were confirmed to be appreciably associated with different phenotypes related to CKD progression.
Second, although the DNA samples used in our study were whole blood DNA samples, a gene or that of the same family reported as GWAS-level in previous studies was also identified in our findings. Moreover, our study demonstrated that epigenetic markers affect gene expression and proteomic production more in the central dogma than at the GWAS-level. Furthermore, since our CpG markers were extracted from a clinically accessible peripheral blood sample, they can be used as diagnostic markers in the future.
We have identified two epigenetic markers (cg10297223 on AGTR1 and cg02990553 on KRT28) that show a potential association with diabetic CKD progression in the Korean population. Based on functional annotations and PheWAS, both genes with CpG sites may offer insights into the activation of genetic markers in diabetic CKD, suggesting that cg10297223 and cg02990553 could be considered as potential clinical biomarkers. Nevertheless, further studies are necessary to validate the association between whole blood and kidney tissue-specific DNA methylation.